MSA-Ghoniem-MC2

Team Members:

Mohammad Ghoniem, Modern Sciences and Arts University, mohammad.ghoniem@gmail.com
Georgiy Shurkhovetskyy, Modern Sciences and Arts University, shurkhovetsky@gmail.com
Ahmed Bahey, Nile University, ahmed.bahey@nileu.edu.eg
Student Team: NO

Tool(s):

The InfoVis Toolkit (IVTK) v0.10 (by Jean-Daniel Fekete, available at sourceforge.net) for visualization of the data
Custom developments made on top of IVTK for the VAST 2012 challenge
Custom Python scripts for data parsing

Answers to Mini-Challenge 2 Questions:

MC 2.1 Using your visual analytics tools, can you identify what noteworthy events took place for the time period covered in the firewall and IDS logs? Provide screen shots of your visual analytics tools that highlight the five most noteworthy events of security concern, along with explanations of each event.

Analyzing the firewall and IDS data for both days with a matrix that shows the number of connections per source IP at half-hour granularity revealed several temporal and topological patterns in network activity (figure 1).
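
The counting behind such a matrix can be sketched in Python as follows. This is only an illustration: the record layout (timestamp, source IP) and the sample values are assumptions, not the actual log schema or challenge data.

```python
from collections import Counter
from datetime import datetime


def half_hour_slot(ts):
    """Round a timestamp down to the start of its half-hour slice."""
    return ts.replace(minute=0 if ts.minute < 30 else 30, second=0, microsecond=0)


def connection_matrix(records):
    """Count connections per (source IP, half-hour slice).

    `records` is an iterable of (timestamp, source_ip) pairs. This layout
    is hypothetical; real log rows would need to be parsed into it first.
    """
    counts = Counter()
    for ts, src in records:
        counts[(src, half_hour_slot(ts))] += 1
    return counts


# Illustrative records, not taken from the challenge data.
sample = [
    (datetime(2012, 4, 5, 18, 5), "172.23.0.10"),
    (datetime(2012, 4, 5, 18, 29), "172.23.0.10"),
    (datetime(2012, 4, 5, 18, 31), "172.23.0.10"),
    (datetime(2012, 4, 5, 18, 40), "172.23.0.11"),
]
matrix = connection_matrix(sample)
for (src, slot), n in sorted(matrix.items()):
    print(src, slot.strftime("%H:%M"), n)
```

Each key of the resulting counter corresponds to one cell of the matrix, so rows are source IPs and columns are half-hour slices.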

A Time Series visualization plots the number of connections for each source or destination IP encountered in the firewall or IDS log. In a custom-made aggregate version of the time series, each curve represents the aggregated connection counts for one source or destination IP during each half-hour time slice. A magic lens allows the investigator to drill down to details: which destination IPs are targeted, on which destination ports, and how the hits are distributed across those ports (figure 2).
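
The kind of summary a magic lens exposes for one curve and one time slice can be approximated as below. This is a sketch, not the IVTK implementation; the function name lens_details, the tuple layout and the sample rows are all hypothetical.

```python
from collections import Counter


def lens_details(records, src_ip, slot):
    """Summarize what one source IP did during one time slice.

    `records` holds (slot, source_ip, dest_ip, dest_port) tuples; this
    layout is an assumption made for the example.
    Returns two counters: hits per destination IP and hits per port.
    """
    dest_ips = Counter()
    dest_ports = Counter()
    for rec_slot, src, dst, port in records:
        if src == src_ip and rec_slot == slot:
            dest_ips[dst] += 1
            dest_ports[port] += 1
    return dest_ips, dest_ports


# Illustrative rows, not taken from the challenge data.
records = [
    ("18:00", "172.23.0.10", "10.32.5.51", 6667),
    ("18:00", "172.23.0.10", "10.32.5.51", 6667),
    ("18:00", "172.23.0.10", "10.32.5.52", 80),
    ("18:30", "172.23.0.10", "10.32.5.52", 80),
    ("18:00", "172.23.0.11", "10.32.5.51", 80),
]
dest_ips, dest_ports = lens_details(records, "172.23.0.10", "18:00")
print(dest_ips.most_common())
print(dest_ports.most_common())
```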

1. SSH traffic permitted

The Parallel Coordinates visualization (figure 3) displays the firewall log and shows the relation between the source IP, destination port, operation and destination IP of each communication recorded by the firewall. Although port 21 (FTP) is blocked and all attempts to connect to it are denied, including attempts to reach the 10.32.5.x websites, there are many connections to port 22 (Secure Shell). Since SSH also supports file transfer, it may be used as a replacement for FTP in order to introduce malware. In fact, as the detailed Time Series view (figure 4) filtered on destination port 22 shows, the firewall log reports outbound SSH traffic from a few internal IPs, such as 172.23.23x.x and 172.23.127.x, to these websites. Many of these IPs are also involved in IRC connections, as explained below.

2. IRC traffic on port 6667

Visually inspecting the matrix (figure 5), one discovers a noteworthy temporal pattern for many IP addresses in the range 172.23.123.x-172.23.136.x. For most of these IPs, activity is sparse on the first day but not on the second. Switching to the aggregate time series visualization (figure 6) and filtering on the IP ranges that contribute to the patterns spotted in the matrix, it appears that those IPs connect to external websites in the range 10.32.5.x on port 6667.

From the matrix view in figure 5 along with the time series view in figure 6, one can see that on the first day IPs in the range 172.23.127.x communicated through port 6667, while on the second day the activity pattern spread to the entire range 172.23.123.x-172.23.136.x. In the Bank of Money network, employees are allowed to use computers only for business purposes, which means that chatting through IRC is prohibited. Outbound traffic on port 6667 is therefore suspicious. The firewall blocks all inbound connections to port 6667 but, surprisingly, allows outbound communications with websites on this port. Port 6667 is commonly used by many trojans. The spread of activity patterns from day 1 to day 2 within the IP ranges of interest can probably be explained by a viral spread from infected workstations to other computers in the neighboring IP ranges. A similar viral spread phenomenon (figure 7) can be spotted for IPs in the ranges 172.23.23x.x and 172.23.24x.x, but their higher activity levels (dark vertical stripes on day 1 and all dark on day 2) may suggest that they were infected even earlier than the 172.23.123.x-172.23.136.x IPs.
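
A day-over-day comparison of this kind can also be checked numerically. The sketch below groups sources by /24 prefix and lists the subnets that only start using port 6667 on the second day. The record layout (day, source IP, destination port) and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def subnet24(ip):
    """Return the first three octets of a dotted IPv4 address."""
    return ip.rsplit(".", 1)[0]


def irc_subnets_by_day(records, port=6667):
    """Map each day to the set of /24 prefixes with traffic to `port`.

    `records` holds (day, source_ip, dest_port) tuples (hypothetical layout).
    """
    by_day = defaultdict(set)
    for day, src, dport in records:
        if dport == port:
            by_day[day].add(subnet24(src))
    return by_day


# Illustrative rows, not taken from the challenge data.
records = [
    (1, "172.23.127.4", 6667),
    (1, "172.23.127.9", 6667),
    (1, "172.23.130.2", 80),
    (2, "172.23.127.4", 6667),
    (2, "172.23.130.2", 6667),
    (2, "172.23.135.8", 6667),
]
by_day = irc_subnets_by_day(records)
newly_active = sorted(by_day[2] - by_day[1])
print(newly_active)
```

Subnets appearing in newly_active are those where the port 6667 pattern is absent on day 1 and present on day 2, which is the visual signature described above.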

Filtering the aggregated Time Series (figure 8) from the firewall log on the 172.23.23x.x and 172.23.24x.x ranges confirms a flawless large-scale synchronization between these workstations. Their traffic targets the 10.32.5.x websites on ports 6667 and 80. That much synchronization cannot be the result of normal user activity. These computers are probably remote controlled by the 10.32.5.x websites, which orchestrate their activity through IRC messages. As a side note, the destination websites themselves connect only to the firewall, on many different ports, which is typical of a port scan.
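
One way to quantify the synchronization seen on the curves is the Pearson correlation between the per-slice connection counts of two workstations. This measure is our illustration here and was not part of the visual analysis itself; the series below are made-up values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long count series.

    Returns 0.0 when either series is constant, since the correlation
    is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Made-up connection counts per half-hour slice for two workstations.
host_a = [0, 12, 0, 12, 0, 12]
host_b = [0, 24, 0, 24, 0, 24]
r = pearson(host_a, host_b)
print(round(r, 3))
```

A value close to 1 across many pairs of workstations would support the claim that their activity is driven by a common external trigger rather than by independent users.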

3. Viral Spread through network shares

Looking at the IDS log using parallel coordinates (figure 9), it appears that many of the captured connections on service ports were on ports 139 (NetBIOS) and 445 (Microsoft Directory Services). Both ports are used for network sharing and are known to be used by numerous worms and trojans. Viewing the IDS log as a Time Series aggregated on source IP, the IP 172.23.231.69 stands out for carrying out a port scan of the DNS server on April 6 between 00:00 and 00:30 A.M. These connections are internal, so they leave no trace in the firewall log. Relying on the IDS log alone, it did not appear possible to establish a formal relationship between communications on ports 139/445 and the spread of suspicious behavior involving port 6667.
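
A port scan of this kind shows up as one source touching many distinct destination ports on a single target. A minimal sketch of that check follows; the record layout, the threshold of 100 ports and the documentation-range addresses are all assumptions chosen for the example.

```python
from collections import defaultdict


def scan_candidates(records, min_ports=100):
    """Find (source, destination) pairs that touch many distinct ports.

    `records` holds (source_ip, dest_ip, dest_port) tuples (hypothetical
    layout). `min_ports` is an arbitrary threshold for this example.
    """
    ports = defaultdict(set)
    for src, dst, dport in records:
        ports[(src, dst)].add(dport)
    return {pair: len(seen) for pair, seen in ports.items() if len(seen) >= min_ports}


# Synthetic rows using documentation addresses, not challenge data.
records = [("192.0.2.69", "192.0.2.10", port) for port in range(1, 201)]
records += [("192.0.2.20", "192.0.2.10", 53)] * 500
suspects = scan_candidates(records)
print(suspects)
```

Counting distinct ports rather than raw connections keeps a busy but legitimate client (many hits on one port) from being flagged.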

4. Datacenter traffic blocked at the firewall

Viewing the Time Series (figure 10) aggregated on source IP, one can see that IP 172.32.0.132 initiates an abnormally high number of connections, many of which target the data center. This IP also connects to the websites in the range 10.32.0-1.20x. In the matrix view, a pattern can be observed involving the high-volume IP and a couple of other IPs in the range 172.23.0.x. Filtering on them in the Time Series reveals an interesting synchronization between their activity and that of the high-volume IP. For example, IPs 172.23.0.131 and 172.23.0.26 follow identical trends. On the second day, IP 172.23.252.10 dominates the traffic, connecting to external websites in the range 10.32.5.x on ports 80 and 6667.

5. Uncharted IP range 172.28.29.x

Using the Matrix visualization of the firewall logs, an uncharted IP range was discovered. The local area network is supposed to include only IPs in the range 172.23.x.x. However, about 10 IPs in the range 172.28.29.x appear from 6:00 PM to around midnight on both days (visible on both the matrix and time series views). They communicate with the data center and with websites in the range 10.32.0-1.20x. Both their mere existence and their temporal activity pattern are suspicious.
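
The same check can be scripted with Python's standard ipaddress module. In this sketch the LAN prefix 172.23.0.0/16 follows the text, while 10.32.0.0/16 is our assumption for covering the website ranges mentioned in this report; the sample addresses are illustrative.

```python
import ipaddress


def uncharted_sources(source_ips, expected_networks):
    """Return the source IPs that fall outside every expected network."""
    unknown = set()
    for ip in source_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in expected_networks):
            unknown.add(ip)
    return sorted(unknown)


expected = [
    ipaddress.ip_network("172.23.0.0/16"),  # LAN range stated in the text
    ipaddress.ip_network("10.32.0.0/16"),   # assumed range for the websites
]

# Illustrative addresses, not a list extracted from the logs.
seen = ["172.23.0.10", "172.28.29.5", "172.23.231.69", "172.28.29.7", "10.32.5.51"]
print(uncharted_sources(seen, expected))
```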

MC 2.2 What security trend is apparent in the firewall and IDS logs over the course of the two days included here? Illustrate the identified trend with an informative and innovative visualization.

The main trend observed is the spread of IRC communications inside the LAN targeting the 10.32.5.x websites. The matrix representations in figures 5 and 7, together with the magic lens and time series views in figures 6 and 8, capture this trend.

MC 2.3 What do you suspect is (are) the root cause(s) of the events identified in MC 2.1? Understanding that you cannot shut down the corporate network or disconnect it from the Internet, what actions should the network administrators take to mitigate the root cause problem(s)?

Some ports, like 21 (FTP) and 161 (SNMP), are denied by the firewall in both directions. Other equally critical ports receive an asymmetric treatment: they are allowed for outbound connections and denied for inbound connections. These include ports 6667 (IRC) and 22 (SSH), which seem to be the main root cause of the security issues faced by the organization. Since it does not seem necessary for the bank's operations to allow such traffic, we recommend that the network administrators start by denying all incoming and outgoing activity on these ports.
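
The asymmetric treatment described above can be detected from the firewall log by comparing, for each destination port, which actions occur in each direction. The sketch below assumes records of the form (direction, destination port, action) with the labels "inbound"/"outbound" and "allow"/"deny"; these labels and the sample rows are hypothetical, not the firewall's actual vocabulary.

```python
from collections import defaultdict


def asymmetric_ports(records):
    """List ports allowed outbound but only ever denied inbound.

    `records` holds (direction, dest_port, action) tuples using the
    hypothetical labels described in the text above.
    """
    seen = defaultdict(set)
    for direction, port, action in records:
        seen[port].add((direction, action))
    return sorted(
        port
        for port, combos in seen.items()
        if ("outbound", "allow") in combos
        and ("inbound", "deny") in combos
        and ("inbound", "allow") not in combos
    )


# Illustrative rows mirroring the situation described in the text.
records = [
    ("inbound", 21, "deny"), ("outbound", 21, "deny"),
    ("inbound", 22, "deny"), ("outbound", 22, "allow"),
    ("inbound", 6667, "deny"), ("outbound", 6667, "allow"),
    ("inbound", 80, "allow"), ("outbound", 80, "allow"),
]
print(asymmetric_ports(records))
```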

In addition, a subset of the allowed websites [10.32.5.50-59,68,69] proved to be a major source of suspicious traffic and/or malware. Blocking ports 22 and 6667 may be enough to mitigate the threats coming from them. The network administrators should determine whether these websites are legitimate and necessary for the bank's employees. If not, they should be blocked. We also believe that the financial transactions and webmail services should not physically reside in the same data center, because a compromise of either one would affect the other.


Fig.1 Matrix visualization of source IP activity over time	Fig.2 Aggregated time series representation of outbound traffic


Fig.3 Parallel Coordinates visualization of port 22 traffic	Fig.4 Detailed Time Series filtered on destination port 22


Fig.5 Visual pattern of IRC communications started on day 1 and spreading on day 2	Fig.6 Aggregated Time Series of 172.23.123.x-172.23.133.x reveal IRC traffic


Fig.7 Visual pattern of IRC communications started on day 1 and spreading on day 2	Fig.8 Aggregated Time Series of 172.23.23x.x-172.23.24x.x reveal IRC traffic


Fig.9 Parallel Coordinates visualization of 139/445 traffic	Fig.10 Detailed Time Series filtered on destination port 22